THE BIOMETRICS SECTION, AMERICAN STATISTICAL ASSOCIATION 


MISSING-PLOT TECHNIQUES 


R. L. ANDERSON 
Institute of Statistics, University of North Carolina 


In experimental work it frequently happens that one or more experimental units are missing from the data or have to be rejected because of conditions outside the control of the experimenter. It should be cautioned that observations should be rejected in the analysis of results only under extreme circumstances, when it is quite obvious that the treatment being studied is not responsible for the apparently anomalous results.

One of the first papers on the subject of estimating the yield of a missing unit in field experimental work was published by Allan and Wishart (1). They derived formulas and illustrated their use for a single missing plot in a randomized block and in a Latin square experiment. These methods were extended by Yates (7) to cover several missing units in a given experiment.

The formula given by Yates for estimating the yield of a single missing unit in a randomized block experiment is

$$ y = \frac{rB + tT - G}{(r-1)(t-1)}, $$

where r = the number of blocks and t = the number of treatments in the experiment, B = the total yield of the remaining units in the block where the missing unit appears, T = the total of the yields of the treatment with the missing unit, and G = the grand total.
Similarly for a single missing unit in a Latin square,

$$ y = \frac{r(R + C + T) - 2G}{(r-1)(r-2)}, $$

where r = the number of rows, columns and treatments, and R and C represent the total yields of the remaining units in the row and column in which the missing unit appears.
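The two single-unit formulas can be sketched in a few lines of Python. The function names and the small data sets below are hypothetical illustrations, not taken from the paper; the data are built to be exactly additive, so that each formula should return the yield that was removed.

```python
def missing_randomized_block(r, t, B, T, G):
    """Estimate of one missing unit in a randomized block experiment."""
    return (r * B + t * T - G) / ((r - 1) * (t - 1))


def missing_latin_square(r, R, C, T, G):
    """Estimate of one missing unit in an r x r Latin square."""
    return (r * (R + C + T) - 2 * G) / ((r - 1) * (r - 2))


# Hypothetical additive 3 x 3 Latin square:
# yield = 10*row + column + 100*treatment, treatment = (row + column) mod 3.
# The unit in row 0, column 0 (true yield 0) is treated as missing.
square = {(i, j): 10 * i + j + 100 * ((i + j) % 3)
          for i in range(3) for j in range(3)}
del square[(0, 0)]
R = sum(v for (i, j), v in square.items() if i == 0)            # rest of row 0
C = sum(v for (i, j), v in square.items() if j == 0)            # rest of column 0
T = sum(v for (i, j), v in square.items() if (i + j) % 3 == 0)  # rest of treatment 0
G = sum(square.values())                                        # grand total
latin_estimate = missing_latin_square(3, R, C, T, G)

# Hypothetical additive layout [[1, 2], [3, 4]] (blocks x treatments)
# with the unit "1" missing: B = 2, T = 3, G = 9.
rb_estimate = missing_randomized_block(r=2, t=2, B=2, T=3, G=9)
```

Because the made-up yields contain no error component, both estimates reproduce the removed values; with real data the estimate is the value that minimizes the error sum of squares.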

Yates uses these formulas for several missing units by means of iterative methods, giving an example for an 8x10 randomized block experiment with 9 missing units.

He also shows that in a complete analysis of variance using the above missing-plot values, the treatment sum of squares is over-estimated but may be corrected by subtracting the bias. The bias in a randomized block experiment with one missing unit is

$$ \frac{1}{t(t-1)}\,[B - (t-1)y]^2 . $$
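As a numerical check of this bias expression, the following sketch compares the treatment sum of squares from the completed table with the exact treatment sum of squares obtained by fitting constants to the existing units only. The yields are hypothetical, and the helper name `rb_sums` is an assumption introduced for the illustration.

```python
def rb_sums(table):
    """Return (treatment SS, error SS) for a complete blocks x treatments table."""
    r, t = len(table), len(table[0])
    grand = sum(map(sum, table))
    cf = grand ** 2 / (r * t)                       # correction for the mean
    total = sum(v * v for row in table for v in row)
    blocks = sum(sum(row) ** 2 for row in table) / t
    trts = sum(sum(row[j] for row in table) ** 2 for j in range(t)) / r
    return trts - cf, total - blocks - trts + cf


# Hypothetical yields: 4 blocks (rows) x 3 treatments (columns);
# the unit in the third block, third treatment is missing.
existing = [[12, 15, 11], [14, 18, 13], [10, 16], [13, 17, 12]]
r, t = 4, 3
B = sum(existing[2])                                    # rest of the block
T = existing[0][2] + existing[1][2] + existing[3][2]    # rest of the treatment
G = sum(map(sum, existing))
y = (r * B + t * T - G) / ((r - 1) * (t - 1))

completed = [row[:] for row in existing]
completed[2].append(y)
trt_ss_completed, error_ss = rb_sums(completed)

# Exact treatment SS: residual SS after fitting blocks only to the existing
# units, minus the error SS of the full model.
blocks_only_residual = sum(sum(v * v for v in row) - sum(row) ** 2 / len(row)
                           for row in existing)
trt_ss_exact = blocks_only_residual - error_ss

bias_observed = trt_ss_completed - trt_ss_exact
bias_formula = (B - (t - 1) * y) ** 2 / (t * (t - 1))
```

For these made-up figures the two quantities agree to rounding error, which is the behaviour the formula describes.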


With one missing unit of a Latin square, the bias in the treatment sum of squares is

$$ \frac{[G - R - C - (r-1)T]^2}{[(r-1)(r-2)]^2} . $$
Yates gives a generalized formula for the bias for any number of missing units in a randomized block experiment. In practice, this treatment bias is so small that it can be neglected.

*A square lattice has k² treatments. An orthogonal lattice is one for which the separate groups are orthogonal.

Finally Yates presents formulas for the variances of treatment means with missing units. The difference between the mean of a treatment with the yield of a missing unit estimated by the above methods and one with no missing units has a variance in a randomized block experiment of

$$ \frac{\sigma^2}{r}\left[2 + \frac{t}{(r-1)(t-1)}\right] $$

and in a Latin square experiment of

$$ \frac{\sigma^2}{r}\left[2 + \frac{r}{(r-1)(r-2)}\right], $$

where σ² is the variance of the yield of a single unit. σ² is estimated by s².

When both treatments in a comparison involve one or more missing units, the formulas for variances between means are more complex because of a correlation between the means. Yates presents an approximate method for handling this case.

In subsequent articles, Yates (8,9) des- 
cribes methods of handling Latin square ex- 
periments in which whole rows, columns or 
treatments are missing, or in which one row 
and one column or one row (or column) and 
one treatment are missing. 

The problem of missing-plots becomes much 
more complicated when we consider the var- 
ious incomplete block designs. Cornish has 
described missing-plot formulas for incom- 
plete randomized block designs, simple and 
triple square lattices, cubic lattices and lattice 
squares, when the intra-block variance alone 
was used in estimating the error variance (4). 
Bliss presents an example of the use of the 
above missing-plot formula for lattice squares 
in some corn selection tests (2). 

The missing-plot formulas for any orthogonal square lattice* may be generalized as:

$$ y = \frac{k^2(r-1)T + rdk(r-1)B - rk(V_2 + V_3 + \dots) - k(r-1)U_1 + k(U_2 + U_3 + \dots) - rS_1 + G}{(r-1)(k-1)(rdk-k-1)}, $$

where the missing unit occurs in group 1** (interchange numbers if in another group). We assume d duplications of an r-group lattice (r=2 for a simple lattice and r=3 for a triple lattice) with k units per block. T and B are the respective total yields for the treatment and block with the missing unit, G is the grand total, and S_1 is the total for group 1. V_2 is the total yield in group 2 only of all treatments appearing in the same 2-block as the treatment with the missing unit (including this treatment); U_2 is the total yield for the entire experiment of the same treatments as V_2. Similarly for 1, 3, ... These totals refer to the following numbers of units: T (rd-1), B (k-1), V_i (dk), U_1 (rdk-1), U_i (rdk-1), S_1 (dk²-1), G (rdk²-1) for i = 2, 3, ..., r.
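One simple consistency check on the generalized lattice formula is to suppose every existing unit yields exactly 1, so that each total equals the number of units it covers; the estimate should then also be 1. The sketch below does this. The function names are hypothetical, and `V` and `U` are assumed to hold the totals for groups 2 through r.

```python
def lattice_missing_estimate(k, r, d, T, B, V, U1, U, S1, G):
    """Missing-unit estimate for an orthogonal square lattice.

    V and U are lists holding V_2..V_r and U_2..U_r respectively.
    """
    numerator = (k ** 2 * (r - 1) * T + r * d * k * (r - 1) * B
                 - r * k * sum(V) - k * (r - 1) * U1 + k * sum(U)
                 - r * S1 + G)
    return numerator / ((r - 1) * (k - 1) * (r * d * k - k - 1))


def unit_yield_check(k, r, d):
    """Feed in the unit counts as totals, i.e. every existing yield equals 1."""
    return lattice_missing_estimate(
        k, r, d,
        T=r * d - 1,
        B=k - 1,
        V=[d * k] * (r - 1),
        U1=r * d * k - 1,
        U=[r * d * k - 1] * (r - 1),
        S1=d * k * k - 1,
        G=r * d * k * k - 1,
    )


simple_lattice = unit_yield_check(k=5, r=2, d=1)
triple_lattice = unit_yield_check(k=4, r=3, d=2)
```

This only confirms that the coefficients are mutually consistent with the stated unit counts; it is not a substitute for the derivation.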

Cornish (5) later published two articles on methods of handling lattice square experiments with missing plots when the inter-block information is also used in estimating the error variance. The computations required in these analyses are quite complex. We may summarize the procedure for the square lattices as follows:

(1) Calculate the true intra-block sum of squares using the intra-block estimate of the missing unit y given above. Call this S_B.

(2) Calculate the sum of squares for the group by treatment interaction using a completely randomized block analysis with the randomized block estimate of y. Call this sum of squares S(rand).

(3) Subtract S(rand) from the total sum of squares corrected for the general mean of the existing values. This gives an unbiased estimate of the reduction in sum of squares due to groups and treatments. Call this S_GT.

(4) Subtract S_B from the corrected total sum of squares of existing values. This gives the reduction in sum of squares due to groups, treatments and blocks. Call this S_GTB.

(5) Subtract S_GT from S_GTB to give the inter-block sum of squares, eliminating treatments.

(6) If only a few of the units are missing, the usual formulas for the weights w and w' can be used to estimate the block adjustments. Cornish states that if as many as 10% of the units are missing, it is doubtful if the experiment is worth analyzing except in very special cases. If an experiment with many missing plots is analyzed, a method of adjusting the formulas for w and w' must be derived. Cornish indicates the general attack on this problem.

**Group 1 is usually denoted as Group X, Group 2 as Group Y, etc.

It appears that the randomized block form- 
ula can be used to estimate the yields of miss- 
ing units for the lattice designs without sacri- 
ficing much information. A study should be 
made of the average relative bias in the error 
variance when the randomized block estimate 
of the yield of a missing unit is used in 
lattice designs. 

The problem of missing units in confounded factorial experiments has been studied by Cochran. When one or several factors in the 2^n and 3^n designs are completely confounded (hence, the same design is duplicated in all replications), the formula for the yield of a missing unit is

$$ y = \frac{rB + kT - B'}{(r-1)(k-1)}, $$

where r = the number of replications, k = the number of units per block, B = the total yield of the other (k-1) units in the same block as the missing unit, B' = the total yield of all r blocks having the same set of treatments as the block with the missing unit (including this block), and T = the total yield of the (r-1) other units which have the same treatment as the missing unit. Cochran has also developed a formula which holds for partial confounding when (a) a different replication of the basic plan is used for each replication of the experiment and (b) no treatment comparison is confounded in more than one replication. These results are included in a mimeographed set of notes by Cochran and Cox on Design of Experiments (3).

If high-order interactions are used as estim- 
ates of error, missing values should be deter- 
mined by the process of minimizing the error 
variance. The basic method used in obtaining 
all of these missing-plot formulas is to let y 
represent the missing value, set up the sum 
of squares for error in terms of y, and then 
estimate y by minimizing this error variance. 
This method is also explained by R. A. 
Fisher (6) with an example of its application 
to a 6x6 Latin square. In certain complicated 
cases, it may be better to use a complete 


least-square solution or to use the covariance 
technique which will be outlined below for 
missing units in split-plot experiments. Yates 
outlines the method of fitting constants by 
least squares in his article on Latin squares 
(8,9). 
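The basic method (write the error sum of squares as a function of the unknown y and minimize it) can be illustrated for a randomized block layout. This is a minimal sketch with hypothetical yields. Since the error sum of squares is a quadratic in y, three evaluations locate its minimum exactly, and the result can be compared with the closed formula given earlier.

```python
def rb_error_ss(table):
    """Error sum of squares of a complete blocks x treatments table."""
    r, t = len(table), len(table[0])
    grand = sum(map(sum, table))
    total = sum(v * v for row in table for v in row)
    blocks = sum(sum(row) ** 2 for row in table) / t
    trts = sum(sum(row[j] for row in table) ** 2 for j in range(t)) / r
    return total - blocks - trts + grand ** 2 / (r * t)


def minimizing_value(table, i, j):
    """Value for cell (i, j) that minimizes the error SS."""
    def error_ss(y):
        filled = [list(row) for row in table]
        filled[i][j] = y
        return rb_error_ss(filled)

    # error_ss(y) = a*y^2 + b*y + c; recover a and b from three points.
    f0, f1, f2 = error_ss(0.0), error_ss(1.0), error_ss(2.0)
    a = (f2 - 2 * f1 + f0) / 2
    b = f1 - f0 - a
    return -b / (2 * a)


# Hypothetical yields, 4 blocks x 3 treatments; the 0 is a placeholder
# for the missing unit in block index 2, treatment index 2.
yields = [[12, 15, 11], [14, 18, 13], [10, 16, 0], [13, 17, 12]]
i, j = 2, 2
r, t = 4, 3
B = sum(v for col, v in enumerate(yields[i]) if col != j)
T = sum(row[j] for row_index, row in enumerate(yields) if row_index != i)
G = sum(map(sum, yields)) - yields[i][j]

by_minimizing = minimizing_value(yields, i, j)
by_formula = (r * B + t * T - G) / ((r - 1) * (t - 1))
```

The same device (fill in y, form the error line of the appropriate analysis, minimize) applies to other designs, provided the error sum of squares is computed for that design.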

Split-plot experiments. The present author derived some formulas for missing plots in split-plot experiments by minimizing the error variance. The method of covariance will be used in the derivations which follow in order to furnish an easy means for estimating the bias in the treatment sum of squares. We shall assume that we have a split-plot experiment with r replications and α whole-plot and β sub-plot treatments so that the total number of units is N = rαβ.

One sub-plot missing. Let the missing unit be that for the whole-plot treatment a_1, sub-plot treatment b_1 and replication r_1. Also let A_1 be the total yield of all existing units with treatment a_1, B_1 the total yield of all existing units with treatment b_1, R_1 the total yield in replication r_1, (A_1B_1) the total yield of all existing units with both a_1 and b_1, (R_1A_1) the total yield of all existing units with both r_1 and a_1, and G the grand total. Let x = 0 and y = the actual yield for the existing units and x = -1 and y = 0 for the missing unit.

In the analysis of covariance table, S(x²) equals the degrees of freedom divided by N in all cases. The cross-product sums S(xy) are given in Table 1.


The best estimate of the yield of the missing unit in order to minimize the sub-plot error is simply the error b regression coefficient,

$$ y = \frac{r(R_1A_1) + \beta(A_1B_1) - A_1}{(r-1)(\beta-1)} . $$


If this yield is used for the missing unit, all sums of squares except that for error b will be slightly over-estimated. The unbiased estimate of any sum of squares is found by first computing a new line in Table 1, which is the degrees of freedom and S(xy) for this sum plus error b. The new S(x²), which is the degrees of freedom divided by N, and the new S(xy) are designated as S(x_1²) and S(x_1y). Then the new regression coefficient is

$$ y_1 = \frac{S(x_1 y)}{S(x_1^2)} . $$


Table 1
Sums of Cross-products for Split-plot Experiment with One Missing Unit.

Source of variation    Degrees of freedom    S(xy)
Replications           r-1                   [-rR_1 + G] / N
Treatment A            α-1                   [-αA_1 + G] / N
Error a (E_a)          (r-1)(α-1)            [-rα(R_1A_1) + rR_1 + αA_1 - G] / N
Treatment B            β-1                   [-βB_1 + G] / N
A×B                    (α-1)(β-1)            [-αβ(A_1B_1) + αA_1 + βB_1 - G] / N
Error b (E_b)          α(r-1)(β-1)           [rα(R_1A_1) + αβ(A_1B_1) - αA_1] / N

The bias in estimating the sum of squares under consideration is

$$ (y - y_1)^2\, S(x_1^2) . $$

Note that the bias is always positive; that is, the sum of squares is always over-estimated in the analysis of variance.

Thus, for the treatment B,

$$ y_1 = \frac{r\alpha(R_1A_1) + \alpha\beta(A_1B_1) - \alpha A_1 - \beta B_1 + G}{(\beta-1)(r\alpha-\alpha+1)} $$

and S(x_1²) = (β-1)(rα-α+1)/rαβ.

For the interaction AB,

$$ y_1 = \frac{r\alpha(R_1A_1) + \beta B_1 - G}{(r\alpha-1)(\beta-1)} $$

and S(x_1²) = (rα-1)(β-1)/rαβ.

An exact treatment of the whole-plot analysis is somewhat more complicated. It is unlikely that the test of significance for treatment A would be seriously affected by the slight biases introduced by use of the missing-plot value. One possibility of obtaining more exact estimates of the sums of squares for treatment A and for error a would be to minimize the sum of squares for error a and calculate the true sum of squares for treatment A on this basis. If this were done, the estimate of the missing value would be

$$ y = -\,\frac{r\alpha(R_1A_1) - rR_1 - \alpha A_1 + G}{(r-1)(\alpha-1)} . $$

However, this gives the same result as would be obtained by considering the entire whole plot (R_1A_1) to be missing, which does not seem justified. A second method would be to use the above covariance technique for B and AB to obtain unbiased estimates of the sum of squares for both A and error a. However, these two adjusted sums of squares would no longer be independent and the F-test could not be used to test the significance of the A differences.

As an example of the missing-plot technique 
with a single missing unit, consider the follow- 
ing wheat straw yields from 4 top dressings 
of nitrogen (0, 15, 30 and 45 pounds) each 
replicated 4 times, constituting the 16 whole 
plots, and 4 types of fertilizers on the sub- 
plots (0-0-0, 15-0-0, 0-40-40, and 15-40-40.) 
The experiment was conducted by the N. C. 
Agricultural Experiment Station as part of a 
Cooperative Fertilizer Experiment. The yields 
are grams of straw per plot. 

Using the missing-plot formula,

$$ y = \frac{4(1930) + 4(2056) - 9080}{(3)(3)} = 763 . $$


The analysis of variance for these data, 
using y=763 and without correction for bias, 
is presented in Table 3. 

Since the treatment B effect is highly significant (and far beyond the 1% point) and the AB interaction is definitely non-significant, we need not be concerned with the bias in the estimates of the sums of squares. For purposes of illustration, we have computed the actual bias in each sum of squares. For B, the bias is 18,238 (a 7% bias); and for AxB the bias is 340 (1%). These would be subtracted from the values in Table 3 for unbiased estimates.
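The arithmetic behind these figures can be retraced from the totals marked in Table 2. The sketch below is an illustration only; it assumes that y and y_1 were rounded to whole grams before squaring, which appears to reproduce the printed biases.

```python
r = alpha = beta = 4
N = r * alpha * beta

# Totals of existing units, as marked in Table 2.
R1A1, A1B1, A1, B1, G = 1930, 2056, 9080, 8862, 31010

# Missing sub-plot estimate (error b regression coefficient).
y = round((r * R1A1 + beta * A1B1 - A1) / ((r - 1) * (beta - 1)))

# Treatment B: new line = B + error b.
df_B = (beta - 1) * (r * alpha - alpha + 1)
y1_B = round((r * alpha * R1A1 + alpha * beta * A1B1
              - alpha * A1 - beta * B1 + G) / df_B)
bias_B = (y - y1_B) ** 2 * df_B / N

# Interaction AB: new line = AB + error b.
df_AB = (r * alpha - 1) * (beta - 1)
y1_AB = round((r * alpha * R1A1 + beta * B1 - G) / df_AB)
bias_AB = (y - y1_AB) ** 2 * df_AB / N
```

Under the stated rounding assumption, `bias_B` and `bias_AB` round to the 18,238 and 340 quoted in the text.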


Table 2.
Yields of Wheat Straw in a Split-plot Experiment.

lb. N        Type of                    Replications
Top Dress    Fertilizer        I       II      III     IV      Total

 0           0- 0- 0          332     260     202     210      1004
             15- 0- 0         346     334     232     228      1140
             0-40-40          340     236     250     217      1043
             15-40-40         454     476     346     384      1660
             Total           1472    1306    1030    1039      4847

15           0- 0- 0          412     384     362     348      1506
             15- 0- 0         420     606     366     426      1818
             0-40-40          540     410     380     604      1934
             15-40-40         604     522     508     510      2144
             Total           1976    1922    1616    1888      7402

30           0- 0- 0          542     472     516     458      1988
             15- 0- 0         638     572     652     550      2412
             0-40-40          693     538     434     614      2279
             15-40-40         844     708     744     706      3002
             Total           2717    2290    2346    2328      9681

45           0- 0- 0          730     590     294     560      2174
             15- 0- 0         664     616     418     702      2400
             0-40-40          740     724     454     532      2450
             15-40-40         738      y      650     668      2056
             Total           2872    1930    1816    2462      9080

Total                        9037    7448    6808    7717     31010

In the 45-lb. section, 2056 = (A_1B_1), 1930 = (R_1A_1) and 9080 = A_1; in the bottom row, 7448 = R_1 and 31010 = G.

B_1 = 1660 + 2144 + 3002 + 2056 = 8862


Table 3.
Analysis of Variance of Wheat Straw Data.

                        Degrees of     Sum of        Mean
                        Freedom        Squares       Square
Whole plots
  Replications              3            162,998       54,333
  Top Dressing (A)          3          1,031,784      343,928**
  Error a                   9             80,644        8,960 = E_a
Sub-plots
  Seedings (B)              3            283,167       94,389**
  AxB                       9             29,391        3,266
  Error b                  35            151,950        4,341 = E_b


Entered as second-class matter, May 25, 1945, at the post office at Washington, D. C., under 
the Act of March 3, 1879. The Biometrics Bulletin is published six times a year—in February, 
April, June, August, October and December—by the American Statistical Association for its 
Biometrics Section. Editorial Office: 1603 K Street, N.W., Washington 6, D. C. ; 

Membership dues in the American Statistical Association are $5.00 a year, of which $3.00 
is for a year’s subscription to the Quarterly Journal, fifty cents is for a year’s subscription to 
the ASA Bulletin and members who pay $1.00 additional receive a year’s subscription to the 
Biometrics Bulletin. Dues for Associate members of the Biometrics Section are $2.00 a year, 
of which $1.00 is for a year’s subscription to the Biometrics Bulletin. Single copies of the 
Biometrics Bulletin are 60 cents each and annual subscriptions are $2.00 Subscriptions and 
applications for eo should be sent to the American Statistical Association, 1603 K 
Street, N.W., Washington 6, D. C, 




Similarly the A treatments are definitely 
different regardless of any slight bias. The 
bias, as computed by the second method, in 
the sum of squares for error a is 2531 (3%), 
and for treatment A is 23,172 (2%). 

If it is desired to test the significance of the difference between two mean yields, one with and one without a missing unit, the usual formulas for the variance between two means do not hold. The following formulas give the variances of the difference between two such means (assume a_2 and b_2 represent treatments with no missing units).

$$ b_1 - b_2:\quad \frac{2E_b}{r\alpha}\left[1 + \frac{\beta}{2\alpha(r-1)(\beta-1)}\right] = \frac{2(4341)}{16}\left[1 + \frac{4}{2(4)(3)(3)}\right] = 573 $$

$$ a_1 - a_2:\quad \frac{2}{r\beta}\left[E_a + \frac{E_b}{2(r-1)(\beta-1)}\right] = \frac{2}{16}\left[8960 + \frac{4341}{18}\right] = 1150 $$

$$ a_1b_1 - a_1b_2:\quad \frac{2E_b}{r}\left[1 + \frac{\beta}{2(r-1)(\beta-1)}\right] = \frac{2(4341)}{4}\left[1 + \frac{4}{18}\right] = 2653 $$

$$ a_1b_1 - a_2b_1:\quad \frac{2}{r\beta}\left[E_a + E_b\left((\beta-1) + \frac{\beta^2}{2(r-1)(\beta-1)}\right)\right] = \frac{2}{16}\left[8960 + 4341\left(3 + \frac{16}{18}\right)\right] = 3230 $$

In these formulas, the unbiased E_a should have been used; however, the bias is usually so small that the difference would not be important. For example, in the above, E_a would have been 8679 instead of 8960. Hence the variance of (a_1 - a_2) = 1115. The standard error of this difference would be 33.4 instead of 33.9.

One whole-plot missing. The whole-plot analysis can be made on a randomized block basis (see page 41), using the same missing-plot formula. The treatment A bias and the variance of the difference between two treatment means must be divided by β, if the results are put on a sub-plot basis.

For the sub-plots, the method of proportionate sub-class numbers can be used to evaluate the B and AB sums of squares. The sub-plot error (E_b) can be obtained by subtracting the variation between existing whole-plots plus the B and AB variations from the total variation between existing plots. In other words, the analysis of the sub-plots need not concern itself with the missing units. In terms of the notation given in the previous section for one missing sub-plot, the various sums of squares are as follows:


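$$ B:\quad \frac{S(B^2)}{r\alpha-1} - \frac{G^2}{\beta(r\alpha-1)} $$

$$ AB:\quad \frac{S(A_1B)^2}{r-1} + \frac{\sum_{i\neq 1} S(A_iB)^2}{r} - \frac{A_1^2}{\beta(r-1)} - \frac{\sum_{i\neq 1} A_i^2}{r\beta} - \frac{S(B^2)}{r\alpha-1} + \frac{G^2}{\beta(r\alpha-1)} $$

$$ E_b:\quad S(y^2) - \frac{S(RA)^2}{\beta} - (\text{sum of squares for B and AB}) $$

The variances of the differences between sub-plot treatment means are as follows:

$$ b_1 - b_2:\quad \frac{2E_b}{r\alpha-1} $$

$$ a_1b_1 - a_1b_2:\quad \frac{2E_b}{r-1} $$

$$ a_2b_1 - a_2b_2:\quad \frac{2E_b}{r} $$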


aba: (HE PE)(L 


aa Gre) 
r B 


asbi—asbi : 


+721) 
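Returning to the case of a single missing sub-plot, the four variances quoted for the wheat straw example (573, 1150, 2653 and 3230) can be retraced numerically. The sketch below uses the mean squares of Table 3. The variable names are hypothetical, and the algebraic forms coded here are reconstructions that agree with the printed results, not necessarily the exact typographical form of the original.

```python
import math

r = alpha = beta = 4
E_a, E_b = 8960, 4341          # mean squares from Table 3

c = 1 / (2 * (r - 1) * (beta - 1))   # common factor 1/18 in this example

# b1 - b2: two sub-plot treatment means
var_b = 2 * E_b / (r * alpha) * (1 + beta * c / alpha)

# a1 - a2: two whole-plot treatment means
var_a = 2 / (r * beta) * (E_a + E_b * c)

# a1b1 - a1b2: same whole-plot treatment, different sub-plot treatments
var_same_a = 2 * E_b / r * (1 + beta * c)

# a1b1 - a2b1: same sub-plot treatment, different whole-plot treatments
var_same_b = 2 / (r * beta) * (E_a + E_b * ((beta - 1) + beta ** 2 * c))

se_a = math.sqrt(var_a)

# With the unbiased error a mean square (8679) in place of 8960:
var_a_unbiased = 2 / (r * beta) * (8679 + E_b * c)
se_a_unbiased = math.sqrt(var_a_unbiased)
```

Rounded, these give the variances and the standard errors 33.9 and 33.4 stated in the text.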




REFERENCES CITED

1. Allan, F. E. and J. Wishart. A method of estimating the yield of a missing plot in field experimental work. Jour. Agr. Sci. 20, Pt. 3, 399-406. 1930.

2. Bliss, C. I. and R. B. Dearborn. The efficiency of lattice squares in corn selection tests in New England and Pennsylvania. Proc. Am. Soc. Hort. Sci. 41:324. 1942.

3. Cochran, W. G. and Gertrude Cox. Experimental designs. (Mimeographed.) Institute of Statistics, Raleigh, N. C.

4. Cornish, E. A. The estimation of missing values in incomplete randomized blocks experiments. Ann. Eugen. 10:112-118. 1940a.
   Cornish, E. A. The estimation of missing values in quasi-factorial designs. Ibid. 10:137-143. 1940b.
   Cornish, E. A. The analysis of quasi-factorial designs with incomplete data. 1. Incomplete Randomized Blocks. Jour. Aust. Inst. Agr. Sci. 6:31-39. 1940c.
   Cornish, E. A. The analysis of quasi-factorial designs with incomplete data. 2. Lattice Squares. Ibid. 7:19-26. 1941a.
   Cornish, E. A. The analysis of quasi-factorial designs with incomplete data. 3. Square, Triple, and Cubic Lattices. (Unpublished) 1941b.

5. Cornish, E. A. The recovery of inter-block information in quasi-factorial designs with incomplete data. 1. Square, Triple, and Cubic Lattices. Bul. 158 (1943). 2. Lattice Squares. Bul. 175. 1944. Coun. Sci. Ind. Res. (Aust.)

6. Fisher, R. A. The design of experiments. Second edition. Oliver and Boyd, Edinburgh. Sec. 58.1. 1937.

7. Yates, F. The analysis of replicated experiments when the field results are incomplete. Emp. Jour. Exp. Agr. 1:129-142. 1933.

8. Yates, F. Incomplete Latin Squares. Jour. Agri. Sci. 26, Pt. 2, 301-315. 1936.

9. Yates, F. and R. W. Hale. The analysis of Latin squares when two or more rows, columns or treatments are missing. Supp. Jour. Royal Stat. Soc. 6, No. 1:67-79. 1939.


LIMITATIONS OF THE APPLICATION OF
FOURFOLD TABLE ANALYSIS TO HOSPITAL DATA*


JOSEPH BERKSON, M.D.
Division of Biometry and Medical Statistics, Mayo Clinic,
Rochester, Minnesota


In the biologic laboratory we have a method of procedure for determining the effect of an agent or process that may be considered typical. It consists in dividing a group of animals into two cohorts, one considered the "experimental group," the other the "control." On the experimental group some variable is brought to play; the control is left alone. The results are set up as in table 1-a. If the results show that the ratio a:a+b is different from the ratio c:c+d, it is considered demonstrated that the process brought to bear on the experimental group has had a significant effect.


A similar method is prevalent in statistical practice, which I venture to think has come into authority because of its apparent equivalence to the experimental procedure. In Biometrika it is referred to as the fourfold table and it is used as a paradigm of statistical analysis. The usual arrangement is that given in table 1-b. The entries, a, b, c and d are manipulated arithmetically to determine whether there is any correlation between A and B. A considerable number of indices have been elaborated to measure this correlation. Pearson has given the formula for calculating the product-moment correlation coefficient from a fourfold table on the assumption that the distribution of both variates is normal; Yule has an index of association for the fourfold table; there are the chi-square test and others. In essence, however, all these indices measure in different ways whether and how much, in comparison with the variation of random sampling, the ratio a:a+b differs from the ratio c:c+d. If the difference departs significantly from zero, there is said to be correlation, and the correlation is the greater the greater the difference.


Now there is a distinction between the method as used in the laboratory and as applied in practical statistics. In the experimental situation, the groups, B and not B, are selected before the subgroupings, A and not A, are effected; that is, we start with a total group of unaffected animals. In the statistical application, the groupings, B and not B, are made after the subgroupings, A and not A, are already determined; that is, all the effects are already produced before the investigation starts. In the end, the tables of the results which are drawn up look alike for the two cases, but they have been arrived at differently. Correlative to this difference, a different interpretation may apply to the results, and this paper deals with a specific case of a kind that arises frequently in a medical clinic or a hospital. I take an example.

*This paper was presented in somewhat different form at a meeting of the American Statistical Association in 1938. Recent inquiries have prompted its publication at this time.


There was prevalent an impression that cholecystic disease is a provocative agent in the causation or aggravation of diabetes. In certain medical circles, the gall bladder was being removed as a treatment for diabetes. The authorities of a hospital wish to know whether their accumulated records of incidence, examined statistically, support this practice. On the face of it, it would appear that we have here the typical and elementary problem of the comparison of rates in a fourfold table. The total population of patients for a period is to be divided into two groups, "diabetes" and "no diabetes" and the rate of incidence of cholecystitis in the one compared with the rate in the other. Accordingly, table 2 was set up.


Table 2 shows a significant difference in- 
dicating positive correlation between chole- 
cystitis and diabetes. An objection which 
might be brought against this particular 
tabulation is that the “not diabetes” group 
consisting as it does of all patients without 
diabetes, will contain a variety of diagnoses, 
some of which may themselves be correlated 
with cholecystitis, even as diabetes may be; 
hence the control may be considered not good. 
To meet this objection we do not select for 
the control group the entire nondiabetic pop- 
ulation, but take a diagnosis which cannot 
reasonably be thought to be correlated with 
cholecystitis and use this as a criterion for 
the control group. I took, in fact, several 
refractive errors of the sort for which patients 


come to the clinic for glasses as such a diag- 
nostic group, and table 3 was the result. 


Again we see that the difference is positive 
and significant in comparison with the prob- 
able error, and the usual judgment would be 
that cholecystitis and diabetes are positively 
correlated. Of course, in any detailed analysis 
we should wish to keep age and sex constant, 
inquire into the reliability of the diagnoses, 
and so forth. But the point referred to in 
this paper has no relation to such questions, 
and for the sake of the argument we shall 
consider that all such factors have been ad- 
equately controlled. Even so, do the results 
permit any conclusion as to whether chole- 
cystitis is biologically correlated with dia- 
betes? 


Since the hospital population comes from the general population, let us begin there. For the sake of simplification, we shall consider only the three diseases referred to, cholecystitis, diabetes and refractive errors. If the incidence of these conditions in the general population is represented by p_c, p_d and p_r and there is no correlation between the diseases, we have for the constitution of the population the expressions shown in table 4, in which n_d is the number having diabetes but not having cholecystitis nor refractive errors, n_dc those having diabetes and cholecystitis but not having refractive errors, n_dcr those having diabetes, cholecystitis and refractive errors, n_0 those having none of these diseases, and so forth. N is the total population. If we assume for illustrative purposes, a population of 10,000,000 persons, and p_d = 0.01, p_c = 0.03, and p_r = 0.10, the numbers of the various constituents are given in table 4. From these figures we may set up two fourfold tables as before (table 5).
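The constitution of such a population follows directly from the assumed independence of the three conditions. The sketch below computes the constituents from the stated N and incidences. The function and label names are hypothetical, and the figures it yields are computed here from the stated assumptions rather than copied from the paper's table 4.

```python
from fractions import Fraction
from itertools import product

N = 10_000_000
# Incidences stated in the text: diabetes, cholecystitis, refractive errors.
p = {"d": Fraction(1, 100), "c": Fraction(3, 100), "r": Fraction(10, 100)}


def constituents(N, p):
    """Expected number in each disease combination under independence."""
    counts = {}
    for present in product([False, True], repeat=len(p)):
        share = Fraction(1)
        label = ""
        for (name, prob), has in zip(p.items(), present):
            share *= prob if has else 1 - prob
            label += name if has else ""
        counts[label or "0"] = share * N   # "0" = none of the three
    return counts


counts = constituents(N, p)


def cholecystitis_rate(group):
    """Incidence of cholecystitis among everyone having the given condition."""
    members = {k: v for k, v in counts.items() if group in k}
    with_c = sum(v for k, v in members.items() if "c" in k)
    return with_c / sum(members.values())
```

In the general population the rate is the same whichever group is examined, `cholecystitis_rate("d") == cholecystitis_rate("r")`, which is the zero difference the text describes for table 5.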


In both parts of table 5 it is seen that the 
difference of the pertinent ratios is zero, 
which is as it should be, since there is no 
correlation. This result, of course, could have 
been foreseen without this computation but I 
desired to establish the numbers for use later. 
Now suppose we follow that portion of the 
population which gets to the hospital. For 
this purpose we must develop some elementary 
relationships. 


We shall suppose that associated with each particular disease is a definite probability that its victims will be selected for the hospital. That is, we shall suppose that a person who has cholecystitis has a certain definite probability of being drawn to the hospital because of the presence of that disease alone, and so for other diseases. Furthermore, for simplicity we shall say that these selective probabilities operate independently, as though a person who had two diseases were like Siamese twins, each one of whom had one disease, so that the probability of the twins' coming to the hospital is the probability of either one getting there, but the presence of one disease does not affect the other in any way. Let the selective rates be represented by s_1, s_2, s_3, and so forth and their complements (1 - s) be represented by t_1, t_2, t_3, and so forth, the number in the general population by n and the number in the hospital by n'. Then, we have the following equations:

$$ n'_1 = n_1(1 - t_1) = n_1 s_1 $$

$$ n'_{12} = n_{12}(1 - t_1t_2) = n_{12}(s_1 + s_2 - s_1s_2) $$

$$ n'_{123} = n_{123}(1 - t_1t_2t_3) = n_{123}(s_1 + s_2 + s_3 - s_1s_2 - s_1s_3 - s_2s_3 + s_1s_2s_3) $$

From these relationships an interesting conclusion can at once be drawn. Suppose all the s's are equal, but small; then the following ratios will result:

$$ \frac{n'_{12}}{n'_1} = \frac{n_{12}}{n_1}(2 - s) = \text{approximately } \frac{n_{12}}{n_1} \times 2 $$

$$ \frac{n'_{123}}{n'_1} = \frac{n_{123}}{n_1}(3 - 3s + s^2) = \text{approximately } \frac{n_{123}}{n_1} \times 3 $$

From these equations it is seen that the ratio of multiple diagnoses to single diagnoses in the hospital will always be greater than in the general population; for two diagnoses the ratio will be about twice that of the general population, for three diagnoses about three times, and so forth.

Let us now apply the appropriate factors of selection to the various constituents of the hypothetical general population which have been enumerated. Assuming as a simple instance that all the selective probabilities are equal and have the value 0.05, the frequencies given in tables 6 and 7 will result.

We see here that though in the general population, the incidence of cholecystitis was identical among the persons who had diabetes and the persons who had refractive errors, in the hospital population the incidence was less in the diabetic group than in the control group, giving an appearance of a small negative correlation, and this in the face of the fact that we have assumed equality of selective rates for the various diseases.

In general the selective rates can be assumed to be anything but equal for different diseases. Various circumstances, such as the severity of the symptoms, the amenability of the disease to treatment by a local physician or the reputation of a particular hospital for treatment of particular diseases, will determine the probability that a specific disease will bring its victim to a particular hospital. To see the effect of a variation in selective rates, let us hypothesize some values which will differ among themselves as follows: s_c = 0.15, s_d = 0.05, s_r = 0.20. The resulting numbers of the various constituents of the population that will come into the hospital are shown in table 8 and the fourfold table drawn up from these figures is given as table 9.

We now find that the incidence of cholecystitis in the diabetic group is about twice that of the control. This would show, so far as the hospital population is concerned, a positive correlation between cholecystitis and diabetes, but it would be quite unrepresentative of the situation in the general population and of no biologic significance.

The relationships dealt with arithmetically in the previous tables are given algebraically as follows:

$$ p'_{1.2} = \frac{p_1q_3(1 - t_1t_2) + p_1p_3(1 - t_1t_2t_3)}{p_1q_3(1 - t_1t_2) + q_1q_3(1 - t_2) + p_1p_3(1 - t_1t_2t_3) + p_3q_1(1 - t_2t_3)} $$

$$ p'_{1.3} = \frac{p_1(1 - t_1t_3)}{p_1(1 - t_1t_3) + q_1(1 - t_3)} $$




Where

p'_{1.2} is the incidence in the hospital population of condition 1 among persons who have condition 2

p'_{1.3} is the incidence in the hospital population of condition 1 in the control group who have condition 3

p_1, p_2, and p_3 are the independent probabilities in the general population of conditions 1, 2 and 3, q = 1 - p

t_1, t_2, and t_3 are the complements (1 - s) of the independent selective probabilities s_1, s_2 and s_3 applying to conditions 1, 2 and 3
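The effect of selection can be traced numerically from these expressions. The following is a sketch under the text's assumptions (independent conditions, independently operating selective rates), with condition 1 = cholecystitis, 2 = diabetes, 3 = refractive errors. The function names are hypothetical, and the control group is taken to be persons with condition 3 but not condition 2, which is how the second expression reads.

```python
from math import prod


def hospital_fraction(*s):
    """Probability of reaching the hospital with selective rates s: 1 - t1*t2*..."""
    return 1 - prod(1 - rate for rate in s)


def hospital_incidences(p1, p3, s1, s2, s3):
    """Return (p'_1.2, p'_1.3): incidence of condition 1 in the two hospital groups."""
    q1, q3 = 1 - p1, 1 - p3
    # Diabetic group (everyone has condition 2; the factor p2 cancels).
    with_1 = (p1 * q3 * hospital_fraction(s1, s2)
              + p1 * p3 * hospital_fraction(s1, s2, s3))
    without_1 = (q1 * q3 * hospital_fraction(s2)
                 + p3 * q1 * hospital_fraction(s2, s3))
    p_1_2 = with_1 / (with_1 + without_1)
    # Control group (condition 3 without condition 2).
    p_1_3 = (p1 * hospital_fraction(s1, s3)
             / (p1 * hospital_fraction(s1, s3) + q1 * hospital_fraction(s3)))
    return p_1_2, p_1_3


# Equal selective rates of 0.05, with p_c = 0.03 and p_r = 0.10 as in the text.
equal = hospital_incidences(0.03, 0.10, 0.05, 0.05, 0.05)

# Unequal rates from the text: s_c = 0.15, s_d = 0.05, s_r = 0.20.
unequal = hospital_incidences(0.03, 0.10, 0.15, 0.05, 0.20)

# Ratio factors for multiple diagnoses when all rates equal s.
s = 0.05
double_factor = hospital_fraction(s, s) / s        # equals 2 - s
triple_factor = hospital_fraction(s, s, s) / s     # equals 3 - 3s + s^2
```

With equal rates the diabetic group shows a slightly lower incidence than the control, and with the unequal rates its incidence is a little under twice that of the control, in line with the small negative and the positive spurious correlations described above.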


Comment 


The assumption made in the text that a 
probability can be assigned to every disease, 
which gives the chance that a patient suffering 
from that disease alone, will come to the 
hospital is, I think, in general accord with 
the actual mechanism by which such a patient 
is selected for the hospital population. The 
assumption that these probabilities operated 
independently in an individual who is suffer- 
ing from more than one disease is doubtless 
oversimple. In general we may guess that if 
a patient is suffering from two diseases, each 
disease is itself aggravated in its symptoms 
and more likely to be noted by the patient. 
So far as this difference of fact from assump- 
tion goes, its effect would be to increase 
relatively the representation of multiple diag- 
noses in the hospital, and in general to increase 
the discrepancy between hospital and parent 
population, even more than if the probabilities 
were independent. 

It appears from the development that it is
hazardous to apply in a hospital population
the method of the fourfold table analysis for
an inquiry into the correlation of diseases.
This applies also to other similar problems, 
as for instance whether the incidence of say, 
heart disease, is different for laborers and 
farmers, if it is known that laborers and 
farmers are not represented in the hospital 
in the proportion that they occur in the com- 
munity. However, the formulas given indi- 
cate some special cases in which comparison 
is not basically invalid. If the selective rate 
for any particular condition is zero, the rela- 
tive incidence of that condition in several 
disease groups may be validly examined, re- 
gardless of the selective rates affecting the 
other groups. This refers to inquiries in 
which for instance eye color or anthropologic 
type is examined in various disease groups to 
ascertain whether there is correlation between 
these characters and disease. If each of the 
disease groups examined consists of only one 
disease, for example, diabetes or refractive 
errors but not both, and if the selective rates 
for these two groups do not differ appreciably 
then also it is valid to compare the incidence 
in them of cholecystitis, even though the latter 
disease is not fairly represented in the hospital. 
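The two special cases just described can be illustrated numerically. The sketch below is an illustration under stated assumptions and not the author's own computation: the function name `incidence_in_single_disease_group` is hypothetical, each comparison group is assumed to consist of patients with exactly one group-defining disease, and selection is assumed to act independently as in the text.

```python
def incidence_in_single_disease_group(p1, s1, s_group):
    """Hospital incidence of condition 1 within a group defined by one
    other disease whose selective probability is s_group.

    p1 -- general-population probability of condition 1
    s1 -- selective probability attached to condition 1
    """
    t1, tg = 1 - s1, 1 - s_group
    with_1 = p1 * (1 - t1 * tg)          # have condition 1 and were selected
    without_1 = (1 - p1) * (1 - tg)      # lack condition 1 and were selected
    return with_1 / (with_1 + without_1)


p1 = 0.03

# Case 1: the selective rate for condition 1 is zero (for instance eye
# color). The incidence equals p1 whatever the rate for the group.
for s_group in (0.05, 0.20):
    print(s_group, incidence_in_single_disease_group(p1, 0.0, s_group))

# Case 2: condition 1 has its own selective rate, but the two groups have
# equal rates. Both incidences are inflated above p1, yet by the same
# amount, so the comparison between groups remains valid.
print(incidence_in_single_disease_group(p1, 0.15, 0.05),
      incidence_in_single_disease_group(p1, 0.15, 0.05))

# Unequal group rates: the incidences differ although p1 is the same.
print(incidence_in_single_disease_group(p1, 0.15, 0.05),
      incidence_in_single_disease_group(p1, 0.15, 0.20))
```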

Except for such cases there does not appear 
to be any ready way of correcting the spurious 
correlation existing in the hospital population 
by any device that does not involve the acqui- 
sition of data which would themselves answer 
the primary question. For instance the de- 
vice sometimes used of setting up in the 
hospital sample a one-to-one control so that 
both groups examined have the same number 
of cases and are identical as regards say, age 
and sex does not touch the difficulties referred to here. It is to be
emphasized that the spurious correlations referred to are not a
consequence of any assumptions regarding biologic forces, or the direct
selection of correlated probabilities, but are the result merely of the
ordinary compounding of independent probabilities. The same results as
shown here would appear if the sampling were applied to randomly
distributed cards instead of patients.


Table 1
Fourfold Tables

[a. Typical of experimental situation: groups Experimental and Control, classified by Effect and No effect, with totals. b. Statistical form: marginal totals (b+d shown) and grand total a+b+c+d. The cell entries are not recoverable from the source.]

Table 2
Relation of cholecystitis to diabetes, hospital population

                        A:               Not A:
                        Cholecystitis    Not cholecystitis    Total
B: Diabetes                      28               548            576
Not B: Not diabetes           1,326            39,036         40,362
Total                         1,354            39,584         40,938

Cholecystitis in diabetic group                  4.86%
Cholecystitis in control group (not diabetic)    3.28%
Difference                                       +1.58% ± 0.5%

Table 3
Relation of cholecystitis to diabetes, hospital population,
refractive errors used as control

                        A:               Not A:
                        Cholecystitis    Not cholecystitis    Total
Diabetes                         28               548            576
Refractive errors                68             2,606          2,674
Total                            96             3,154          3,250

Cholecystitis in diabetic group                        4.86%
Cholecystitis in control group (refractive errors)     2.54%
Difference                                             +2.32% ± 0.5%

Table 4
Constitution of general population, various diseases

n_a   = p_a q_c q_r × N =    87,300
n_c   = p_c q_a q_r × N =   267,300
n_r   = p_r q_a q_c × N =   960,300
n_ac  = p_a p_c q_r × N =     2,700
n_ar  = p_a p_r q_c × N =     9,700
n_cr  = p_c p_r q_a × N =    29,700
n_acr = p_a p_c p_r × N =       300
n_0   = q_a q_c q_r × N = 8,642,700

N = 10,000,000
p_a = 0.01, p_c = 0.03, p_r = 0.10
q_a = 0.99, q_c = 0.97, q_r = 0.90



Table 5
Cholecystitis and diabetes, general population

                     Cholecystitis    Not cholecystitis    Total
Diabetes                     3,000               97,000       100,000
Not diabetes               297,000            9,603,000     9,900,000
Total                      300,000            9,700,000    10,000,000

Cholecystitis in diabetic group                  3%
Cholecystitis in control group (nondiabetic)     3%
Difference                                       0%

                     Cholecystitis    Not cholecystitis    Total
Diabetes                     3,000               97,000       100,000
Refractive errors           29,700              960,300       990,000
Total                       32,700            1,057,300     1,090,000

Cholecystitis in diabetic group                        3%
Cholecystitis in control group (refractive errors)     3%
Difference                                             0%

Table 6
Enumeration of hospital population for s_a = s_c = s_r = 0.05

General population,                     Hospital population,
numbers                   f*            expected numbers
n_a   =    87,300         0.05          n'_a   =  4,365
n_c   =   267,300         0.05          n'_c   = 13,365
n_r   =   960,300         0.05          n'_r   = 48,015
n_ac  =     2,700         0.0975        n'_ac  =    263
n_ar  =     9,700         0.0975        n'_ar  =    946
n_cr  =    29,700         0.0975        n'_cr  =  2,896
n_acr =       300         0.142625      n'_acr =     43
n_0   = 8,642,700         0             n'_0   =      0

*The fraction of the specified individuals which is selected for the hospital under the operation
of the selective forces s. It is equal to 1 minus the product of the appropriate t's; for example
f_acr = 1 - t_a t_c t_r.
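The enumeration in Table 6 follows mechanically from the footnote rule: each class of the general population is multiplied by one minus the product of the t's for the conditions present. A minimal Python sketch of that bookkeeping is given here; the function name `enumerate_hospital` and the dictionary layout are illustrative assumptions, while the probabilities and population size are those of Table 4.

```python
from itertools import product
from math import prod


def enumerate_hospital(p, s, N):
    """For every combination of conditions return
    (general-population number, selected fraction f, expected hospital number).

    p -- dict: condition label -> probability in the general population
    s -- dict: condition label -> selective probability
    N -- size of the general population
    """
    names = list(p)
    table = {}
    for present in product([True, False], repeat=len(names)):
        flags = list(zip(names, present))
        label = "".join(n for n, has in flags if has) or "0"
        # Independent conditions: multiply p for present, q = 1 - p for absent.
        general = N * prod(p[n] if has else 1 - p[n] for n, has in flags)
        # f = 1 - product of t = 1 - s over the conditions present.
        f = 1 - prod(1 - s[n] for n, has in flags if has)
        table[label] = (general, f, general * f)
    return table


p = {"a": 0.01, "c": 0.03, "r": 0.10}   # diabetes, cholecystitis, refractive errors
s = {"a": 0.05, "c": 0.05, "r": 0.05}
table = enumerate_hospital(p, s, N=10_000_000)

for label, (general, f, hospital) in table.items():
    print(f"n_{label:<4} {general:>12,.0f}  f = {f:<9.6g} n' = {hospital:>8,.0f}")

# Fourfold table for diabetes versus the refractive-error control group.
diab_chol = table["ac"][2] + table["acr"][2]
diab_not = table["a"][2] + table["ar"][2]
ctrl_chol = table["cr"][2]
ctrl_not = table["r"][2]
print(f"diabetic group: {100 * diab_chol / (diab_chol + diab_not):.2f}%")
print(f"control group:  {100 * ctrl_chol / (ctrl_chol + ctrl_not):.2f}%")
```

Changing the entries of `s` to 0.05, 0.15 and 0.20 for a, c and r gives the unequal-rate case in the same way.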


Table 7
Cholecystitis and diabetes, hospital population:
expected numbers for s_c = s_a = s_r = 0.05

                     Cholecystitis    Not cholecystitis    Total
Diabetes                       306                5,311        5,617
Refractive errors            2,896               48,015       50,911
Total                        3,202               53,326       56,528

Cholecystitis in diabetic group                        5.45%
Cholecystitis in control group (refractive errors)     5.69%
Difference                                             -0.24%



Table 8
Enumeration of a hospital population for
s_c = 0.15, s_a = 0.05, s_r = 0.20

General population,                     Hospital population,
numbers                   f             expected numbers
n_a   =    87,300         0.05          n'_a   =   4,365
n_c   =   267,300         0.15          n'_c   =  40,095
n_r   =   960,300         0.20          n'_r   = 192,060
n_ac  =     2,700         0.1925        n'_ac  =     520
n_ar  =     9,700         0.24          n'_ar  =   2,328
n_cr  =    29,700         0.32          n'_cr  =   9,504
n_acr =       300         0.354         n'_acr =     106
n_0   = 8,642,700         0             n'_0   =       0


Table 9
Cholecystitis and diabetes, hospital population:
expected numbers for s_c = 0.15, s_a = 0.05, s_r = 0.20

                     Cholecystitis    Not cholecystitis    Total
Diabetes                       626                6,693        7,319
Refractive errors            9,504              192,060      201,564
Total                       10,130              198,753      208,883

Cholecystitis in diabetic group                        8.55%
Cholecystitis in control group (refractive errors)     4.72%
Difference                                             +3.83%


STANDARD ERRORS OF YIELDS ADJUSTED FOR REGRESSION
ON AN INDEPENDENT MEASUREMENT


D. J. FINNEY


Lecturer in the Design and Analysis of Scientific Experiment,
University of Oxford.


The precision of comparisons between mean results for the several
treatments of a well-planned experiment can often be increased by
application of the analysis of covariance. If y represents the
measurement studied on each experimental unit, and x is a second
measurement on each unit, itself unaffected by the treatments under test
but showing a significant error correlation with y, the regression of y
on x is used to adjust the means for each treatment; if the regression
of y on x is highly significant, the standard errors of differences
between the adjusted treatment means may be substantially less than for
the unadjusted means. For example, in animal feeding trials weight gains
may be adjusted so as to take account of differences in initial weight
that, providing the assignment of animals to treatments was random,
could not be associated with treatments, or crop yields in field-plot
tests may with advantage be adjusted for differences in plant density
when variations in the latter are considerable but are independent of
treatment.

If t treatments are each tested in r-fold replication, in a simple
randomized block or Latin square design, and give mean yields
$\bar{y}_u$ (u = 1, 2, ... t), the means adjusted for regression are

$$\bar{y}'_u = \bar{y}_u - b(\bar{x}_u - \bar{x}),$$

where the $\bar{x}_u$ are the treatment means for the independent
variate and b is the regression coefficient estimated from the error
line of the analysis of variance and covariance. Now the variance of a
difference between two unadjusted means is

$$V(\bar{y}_1 - \bar{y}_2) = \frac{2s^2}{r}, \qquad (1)$$

where $s^2$ is the error mean square for the analysis of variance of y;
this quantity is, of course, the same for every pair of treatments. The
variance of the difference between the corresponding adjusted means is

$$V(\bar{y}'_1 - \bar{y}'_2) = s'^2\left(\frac{2}{r} + \frac{(\bar{x}_1 - \bar{x}_2)^2}{A}\right), \qquad (2)$$

where $s'^2$ is the residual error mean square for y after removal of
the regression component and A is the error sum of squares in the x
analysis. Since this variance depends upon the pair of treatments
compared, the presentation of standard errors in summary tables of means
is made difficult. Often the second term in the expression on the right
of equation (2) is negligible by comparison with the first, so that
little fault is committed by assigning a standard error $s'/\sqrt{r}$ to
each adjusted mean. This procedure, however, leads to consistent
underestimation of standard errors, since the neglected term is
necessarily positive, and should therefore not be employed unless all
differences of the type $(\bar{x}_1 - \bar{x}_2)$ are very small.


A simple modification, which the writer has found useful when separate
computation of equation (2) for every interesting comparison seemed
unnecessary, is to remove the bias by inserting an average value for
$(\bar{x}_1 - \bar{x}_2)^2$. The average value of
$(\bar{x}_u - \bar{x}_v)^2$ over all possible pairs of different
treatments can easily be shown to be $2a/r(t-1)$, where a is the sum of
squares for “treatments” in the analysis of variance for x. Hence, with
this approximation, each treatment mean may be assigned a variance

$$V(\bar{y}'_u) = \frac{s'^2}{r}\left(1 + \frac{a}{A(t-1)}\right), \qquad (3)$$

and twice this amount may be taken as the variance of the difference
between any pair of adjusted treatment means. It may further be shown
that the variance of the difference between a mean yield for one set of
$k_1$ treatments and another set of $k_2$, averaged over all possible
selections of the $k_1$ and $k_2$, is

$$\left(\frac{1}{k_1} + \frac{1}{k_2}\right) V(\bar{y}'_u),$$

and is the same as would be obtained by direct and uncritical use of
equation (3).
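The relation between the exact variance of equation (2) and the averaged variance of equation (3) can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names and the numerical values (treatment means of x, r, A and the residual mean square) are hypothetical and do not come from any experiment in the text. It assumes equal replication, so that the general mean of x is the mean of the treatment means.

```python
from itertools import combinations


def var_adjusted_diff(s2_adj, r, A, xbar_1, xbar_2):
    """Equation (2): variance of the difference of two adjusted means."""
    return s2_adj * (2.0 / r + (xbar_1 - xbar_2) ** 2 / A)


def avg_var_adjusted_mean(s2_adj, r, A, a, t):
    """Equation (3): average variance assigned to one adjusted mean."""
    return (s2_adj / r) * (1.0 + a / (A * (t - 1)))


def treatment_ss(xbars, r):
    """Treatment sum of squares for x from treatment means of r plots each."""
    grand = sum(xbars) / len(xbars)
    return r * sum((x - grand) ** 2 for x in xbars)


# Hypothetical inputs, chosen only to exercise the formulas.
xbars = [10.2, 11.5, 9.8, 10.9]   # treatment means of the independent variate
r, A, s2_adj = 5, 40.0, 2.5       # replicates, error SS of x, residual mean square
t = len(xbars)
a = treatment_ss(xbars, r)

pairs = list(combinations(xbars, 2))
mean_exact = sum(var_adjusted_diff(s2_adj, r, A, u, v) for u, v in pairs) / len(pairs)
approx = 2 * avg_var_adjusted_mean(s2_adj, r, A, a, t)

print(f"mean of equation (2) over all pairs: {mean_exact:.6f}")
print(f"twice equation (3):                  {approx:.6f}")
```

The two printed values agree, which is the sense in which equation (3) removes the bias on the average while still understating or overstating the variance for any particular pair.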
The average efficiency of comparisons between adjusted treatment means
relative to comparisons between unadjusted means is

$$\frac{s^2}{s'^2\left(1 + \frac{a}{A(t-1)}\right)},$$

which quantity therefore represents the over-all gain in information on
treatment contrasts through taking account of the independent variate.
Providing that treatments have had no effect on x, the expected value of
the factor $\left(1 + \frac{a}{A(t-1)}\right)$ is
$\left(1 + \frac{1}{e-2}\right)$, where e is the number of error degrees
of freedom; this should not normally be used instead of the experimental
value, as considerable variations may occur unless both sums of squares,
a and A, are based on large numbers of degrees of freedom.
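A short numerical sketch of the efficiency ratio may help. The functions below are hypothetical helpers, and the input values are invented for illustration. The expected value of the factor is computed here on the assumption that, when treatments do not affect x, the ratio of the treatment mean square to the error mean square of x follows the F distribution with e denominator degrees of freedom, whose mean is e/(e-2); this gives 1 + 1/(e-2).

```python
def adjustment_factor(a, A, t):
    """The factor 1 + a / (A (t - 1)) that inflates the adjusted variance."""
    return 1.0 + a / (A * (t - 1))


def average_efficiency(s2, s2_adj, a, A, t):
    """Efficiency of adjusted relative to unadjusted comparisons."""
    return s2 / (s2_adj * adjustment_factor(a, A, t))


def expected_factor(e):
    """Expected value of the factor when treatments do not affect x.

    Assumes (a/(t-1)) / (A/e) is an F ratio with mean e/(e-2), e > 2.
    """
    if e <= 2:
        raise ValueError("the expectation exists only for e > 2")
    return 1.0 + 1.0 / (e - 2)


# Hypothetical values: error mean squares before and after adjustment,
# treatment and error sums of squares of x, four treatments, 12 error d.f.
s2, s2_adj = 4.0, 2.0
a, A, t, e = 6.0, 40.0, 4, 12

print(f"observed factor:   {adjustment_factor(a, A, t):.4f}")
print(f"expected factor:   {expected_factor(e):.4f}")
print(f"average efficiency: {average_efficiency(s2, s2_adj, a, A, t):.4f}")
```

An efficiency above 1 indicates that the covariance adjustment has gained information even after allowing for the averaged second term.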


In an experiment of factorial design, the factor
$\left(1 + \frac{a}{A(t-1)}\right)$ could be calculated separately for
each main effect and interaction, taking a as the treatment sum of
squares and (t-1) as the corresponding number of degrees of freedom, but
often an over-all value computed for the total of all treatment sums of
squares will be sufficiently good, since, if the treatments are all
without effect on x, the mean squares a/(t-1) may be expected to be
about the same size; of course, in calculating this sum of squares,
allowance must be made for any partial or total confounding of treatment
contrasts. For more complicated designs, such as those of the lattice
and other incomplete block types, the treatment contrasts are obtained
as weighted means of intra- and inter-block comparisons. As Cochran
(1940) has said, the two kinds of comparison should be adjusted
separately for their own error regressions. If a composite summary table
is required, this may be formed as a weighted mean of the two kinds of
contrast, each being weighted inversely as the average variance of
comparisons adjusted for the regression. Each variance must contain the
appropriate factor corresponding to $\left(1 + \frac{a}{A(t-1)}\right)$;
here a and A must be taken from the parts of the analysis of variance of
x for intra- and inter-block comparisons separately, together with
(t-1), the treatment degrees of freedom in each part of the analysis.
The standard errors of the weighted means will only be approximately
correct, on account of the regression adjustments having introduced
correlations between intra- and inter-block components, but the degree
of approximation will generally be sufficiently good.


REFERENCE


COCHRAN, W. G. (1940). The Analysis of Lattice and Triple Lattice
Experiments in Corn Varietal Tests. II. Mathematical Theory. Iowa Agr.
Exp. Sta., Res. Bull. 281.


QUERIES 


QUERY: May distribution variances be used 
directly to form a ratio equivalent to an F 
ratio, such that probability may then be read 
directly from conventional F tables, in testing 
the significance of differences between two 
variances? If so, what are the assumptions 
involved in, and the limitations imposed upon 
the use of such a method? 

It may be that the answer to the above 
question will require more specific information 
as to the particular circumstances of the ap- 
plication. Hearing thresholds are taken in a 
group of ears before and after exposure to 
aircraft noise. In addition to differences be- 
tween means before and after we are interest- 
ed in knowing whether exposure to noise has 
changed the population variance significantly 
(as an index of the extent to which there 
exists a real difference among individuals in 
susceptibility to traumatic noise deafness). 
Since the variances before and after have al- 
ready been calculated for other purposes, it 
would be an enormous saving in time if the 
before and after variances could be used 
directly as an F ratio with any necessary al- 
lowance for correlation due to paired readings. 


ANSWER: Since each ear has a measurement
before and after exposure, the variances before
and after may well be correlated. If so, the
F test (described in this Bulletin, Vol. 1,
page 70, query number 2) is not applicable
because it rests on the assumption that the
two estimates of variance are independent.

For variances based on correlated variates, 
the Pitman and Morgan method of testing the 
null hypothesis is described by Cochran in 
the same number of the Bulletin cited above, 
query number 3. Since this method requires 
calculation of r, you can save some labor by 
the following method of computation: 


Before    After
 x_1       x_1'
 x_2       x_2'
  .         .
  .         .
 x_k       x_k'
 Sx        Sx'


Calculate the corrected sums of squares and products, $Sx^2$,
$S(x')^2$, $Sxx'$. From these,

$$r = \frac{Sxx'}{\sqrt{Sx^2 \cdot S(x')^2}}$$

$$Sd^2 = Sx^2 + S(x')^2 - 2Sxx'$$

The correlation coefficient, r, is used in the Pitman and Morgan F
test, while $Sd^2$ (the corrected sum of squares of the differences,
x - x') enables you to complete the t test.
G. W. Snedecor
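The computation described in this answer can be sketched in Python as follows. The readings are invented for illustration and the function name is hypothetical. The formula used for the test of equal variances, t = (F - 1)√(n - 2) / (2√(F(1 - r²))) with n - 2 degrees of freedom, is the form of the Pitman and Morgan test commonly given in textbooks; it is not written out in the answer itself, so it should be taken as an assumption about what that test computes.

```python
from math import sqrt


def paired_variance_summary(before, after):
    """Corrected sums of squares and products for paired readings, with
    the paired t for means and the Pitman-Morgan t for variances."""
    n = len(before)
    mean_b, mean_a = sum(before) / n, sum(after) / n
    Sx2 = sum((x - mean_b) ** 2 for x in before)          # S x^2
    Sy2 = sum((y - mean_a) ** 2 for y in after)           # S (x')^2
    Sxy = sum((x - mean_b) * (y - mean_a)                 # S x x'
              for x, y in zip(before, after))

    r = Sxy / sqrt(Sx2 * Sy2)
    Sd2 = Sx2 + Sy2 - 2 * Sxy        # corrected SS of the differences

    # Paired t test on the means, n - 1 degrees of freedom.
    t_means = (mean_a - mean_b) / sqrt(Sd2 / (n * (n - 1)))

    # Test of equal variances for correlated variates, n - 2 d.f.
    F = Sy2 / Sx2                    # both variances have n - 1 d.f.
    t_vars = (F - 1) * sqrt(n - 2) / (2 * sqrt(F * (1 - r * r)))

    return {"r": r, "Sd2": Sd2, "t_means": t_means, "F": F, "t_vars": t_vars}


# Hypothetical hearing thresholds for eight ears, before and after exposure.
before = [12, 15, 9, 20, 14, 11, 17, 13]
after = [18, 22, 10, 35, 19, 12, 30, 16]
for name, value in paired_variance_summary(before, after).items():
    print(f"{name:8s} {value:.4f}")
```

Only the three corrected sums are needed, so no separate pass over the differences is required once the before and after sums of squares are already in hand.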


QUERY: In your answer to the first query of
Vol. 1, page 70 of this Bulletin, you imply that
the mean square for group means should be
divided by the mean square of individuals
even though the quotient turns out to be less
than 1.0. Have you decided not to use the
rule, “Divide the larger mean square by the
smaller”?


ANSWER: Two objectives must be distin- 
guished. First, there is the reading of the 
table itself: since it contains no values less 
than 1.00, it can be entered only if F is greater 
than 1.00. This imposes the rule that goes 
with the table—divide the larger mean square 
by the smaller. There can be no deviation 
from this rule for using the table. 

Second, there is the use to be made of the 
table, and this is varied. In answer to the 
query cited, the assumption was that the ex- 
perimenter is interested in learning if his 
treatments serve to differentiate the popula- 
tion means. He is not concerned with a 
value of F less than that tabled for the prob- 
ability which he has selected. Smaller values 
of F, including those less than 1.00, do not 
constitute evidence for the effectiveness of the 
treatments. For this experimenter, the rule 
for entering the table is always sufficient. 

Another use for the table is the symmetri- 
cal or two-tailed test described in two answers 
contained in Vol. 1, pages 70-71, of the Bulle- 
tin. Again, the rule for entering the table is 
adequate, but the tabular probabilities must 
be doubled. 

A third use of the table is the one that
seems to be in the querist’s mind. He has mean
squares for group means and for individuals,
such as ordinarily arise in analysis of variance,
but the former is the smaller. He evidently
wishes to test the hypothesis that the mean
square between groups is less than that within
groups; that is, he wishes to make the asymmetrical
test on the tail of the distribution
with the small values of F. The nature of the
distribution is such that this can be done
easily: enter the table with the reciprocal of
F (the larger variance divided by the smaller
according to the rule), but interchange the
degrees of freedom. For example, if there
are three groups, each with nine individuals,
and if F=0.10, enter the table with F=1/0.10=10,
n1=24 and n2=2, the 5% value being 19.45.


Several times I have observed significantly 
small values of F but have never found a rea- 
sonable interpretation. In every instance the 
small value seemed to be one of those unusual 
ones that occur occasionally in sampling from 
a homogeneous population. Have any readers 
of this column encountered a significantly 
small value of F with a realistic meaning? 

G. W. Snedecor
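The reciprocal rule for the lower tail can be verified for the example of three groups of nine individuals. The sketch below is an illustration, not part of the answer: it relies on the closed form P(F > x) = (1 + 2x/n)^(-n/2), which holds only when the numerator has 2 degrees of freedom and the denominator n, so no tables or statistical library are needed. The function names are hypothetical.

```python
def upper_tail_F2(x, n2):
    """P(F > x) for F with 2 and n2 degrees of freedom (closed form)."""
    return (1.0 + 2.0 * x / n2) ** (-n2 / 2.0)


def lower_critical_F2(alpha, n2):
    """Value x with P(F <= x) = alpha for F with 2 and n2 d.f."""
    return (n2 / 2.0) * ((1.0 - alpha) ** (-2.0 / n2) - 1.0)


n_groups, n_per_group = 3, 9
df_between = n_groups - 1                    # 2
df_within = n_groups * (n_per_group - 1)     # 24

F = 0.10
p_lower = 1.0 - upper_tail_F2(F, df_within)  # chance of an F this small or smaller
crit_lower = lower_critical_F2(0.05, df_within)

print(f"P(F <= {F}) = {p_lower:.4f}")
print(f"lower 5% point of F(2, 24)        = {crit_lower:.5f}")
print(f"reciprocal, upper 5% of F(24, 2)  = {1.0 / crit_lower:.2f}")  # 19.45
```

The reciprocal of the lower 5% point with 2 and 24 degrees of freedom reproduces the tabled 19.45 for 24 and 2 degrees of freedom, and since 1/0.10 = 10 falls short of it, F = 0.10 is not significantly small at the 5% level.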


QUERY: We have the problem of attempting 
to determine the contribution of the sire to 
rate of laying as measured by average clutch 
size. For example, 29 sires mated to 42 dams 
of one class with respect to clutch size gave 
224 daughters. We have calculated the mean 
clutch size of the daughters of each sire. 
These means have been used to calculate a 
standard deviation that was weighted by using 
the frequency of daughters. From this a 
value of 0.4105 was obtained. The standard 
deviation of the population of 224 daughters 
was 1.0578. Using Pearl’s formula for the 
correlation ratio (1940 edition, page 429) the 
result is 0.4105/1.0578=0.3881. Does this value 
measure the correlation between sires and 
daughters? 


ANSWER: No. Although your data have a sup- 
erficial resemblance to those considered by 
Pearl, they are fundamentally different. He 
was considering two independently measured 
variates with y-arrays in equally spaced 
x-intervals. You have only one variate, the 
average clutch size of daughters. The total 
sum of squares may be partitioned between 
sires and daughters as follows: 


Source of variation          Degrees of freedom    Mean square
Total                               223
Sires                                28                 S
Daughters of same sire              195                 D


The total mean square is (1.0578)^2 = 1.1189 (assuming that you divided
by degrees of freedom, 223), but I am unable to judge from your
description what the mean square for sires is.

Assuming normal distribution in the sampled population, the completed
analysis of variance leads to an estimate of intraclass correlation,

$$\frac{S - D}{S + (k_0 - 1)D},$$

in which $k_0$ is an average number of daughters per sire. If k
represents the number of daughters for each sire, then

$$k_0 = \frac{1}{28}\left(Sk - \frac{Sk^2}{Sk}\right).$$

Note: Subsequent correspondence elicited the information that S=1.348,
D=1.092 and $k_0$=7.627. From these data the correlation 0.030 was
calculated. The value is not significant, since F=1.348/1.092=1.23
whereas F.05=1.53.
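The arithmetic of the note can be reproduced directly. The following Python sketch uses the values of S, D and k0 reported in the note; the function names are hypothetical, and the list of daughters per sire in the last lines is invented solely to show how k0 is formed, since the actual family sizes are not given.

```python
def intraclass_correlation(S, D, k0):
    """Estimate of intraclass correlation from the mean square between
    sires (S), the mean square within sires (D) and average family size k0."""
    return (S - D) / (S + (k0 - 1.0) * D)


def average_family_size(sizes):
    """k0 = (sum k - sum k^2 / sum k) / (number of sires - 1)."""
    total = sum(sizes)
    return (total - sum(k * k for k in sizes) / total) / (len(sizes) - 1)


S, D, k0 = 1.348, 1.092, 7.627
print(f"intraclass correlation = {intraclass_correlation(S, D, k0):.3f}")  # 0.030
print(f"F = S / D = {S / D:.2f}")                                          # 1.23

# Hypothetical family sizes, only to illustrate the k0 formula: with
# unequal families k0 falls a little below the simple average.
sizes = [4, 6, 8, 10, 12]
print(f"simple average = {sum(sizes) / len(sizes):.2f}, "
      f"k0 = {average_family_size(sizes):.2f}")
```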


QUERY: We are planning an experiment with 
insecticides to be applied to sweet corn for 
the control of the European corn borer. The 
degree of infestation cannot be predicted with 
any degree of certainty. Tentative arrange- 
ments for the field experiments are as follows: 

Three treatments (A, B and C) with check 
(X) are to be applied to 12 by 45 foot plots; 
that is, four rows each 45 feet long. The 
field plan is this: 


[Field plan: a rectangular arrangement of plots labeled A, B, C and X; the individual plot positions are illegible in the source.]


The large number of check plots is necessary 
to compensate for drift of insecticide from one 
plot to another. 

Samples are to be taken for the number of
infested plants from 30 plants as near the
center of each plot as can conveniently be
done. The effect of the insecticide is to be
evaluated on the basis of the number of borers,
the yield and the quality of the corn.

All plots are to be planted to a variety of 
susceptible sweet corn, the planting to be as 
early as possible in order that a maximum 
infestation take place. 


ANSWER: For the evaluation of yield your 
design is not very efficient. The desirable 
arrangement is one in which the plots of a 
replication lie as close together as feasible 
so as to avoid large soil differences among 
them, the replication (block) itself being 
nearly square. To avoid drifting of the in- 
secticide and migration of the insects, greater 
separation is called for; hence, compromises 
have to be made. 

Migration of the corn borer larvae is not
a complication, I am told, but allowance must
be made for drift. Perhaps it will be sufficient
to spray only the two inner rows of your 
four-row plots, leaving the outer rows, as well 
as the ends of the inner, to absorb the drifting 
insecticide from adjacent plots. Absence of 
migration makes this plan available. Evalu- 
ations would be made on some 70 plants in a 
space about 6x35 feet in the central portion 
of the plot. This would leave at least 9 feet 
for the drift to subside. If that is not enough, 
another row could be put in between the plots, 
increasing the separation to 12 feet. 

Such a design with four plots lying parallel 
would make the size of the replication 48x45 
or 60x45, either of which would be satisfactory. 
There would be sufficient plants per plot to 
evaluate yield. If infestation by the borer 
is rather uniform, not all the plants would 
have to be dissected: the plots could be sub- 
sampled for damage determination. I am 
assuming that in sweet corn, yield and infesta- 
tion can be measured on the same plants at 
harvest. 


G. W. Snedecor 


ABSTRACTS 


(22) 


STRANDSKOV, HERLUF H. and G. J. SIEMENS
(University of Chicago). An Analysis of the Sex
Ratios Among Single and Plural Births in the Total,
the “White” and the “Colored” U. S. Populations.
(To be published in the American Journal of Physical
Anthropology, 3:--, 1946).


Fisher’s method of determining whether 
one variance is significantly greater than 
another is applied to certain U. S. Census 
data. The variances compared are the var- 
iances of percentages of male births over a 
15 year period and the corresponding variances 
expected due to chance during the same period 
of time. For nearly all of the distributions 
which are examined the observed variance is 
found to be significantly greater than that 
expected due to chance. 

The means of the percentages of males
among single, twin, triplet and quadruplet
births over a 15 year period are compared for
significance of differences. It is found that
the percentage of males decreases significantly
with each increase in number of fetuses
per pregnancy or, as the mammalogist would
say, with each increase in size of litter.

Racial differences in the percentage of 
males among the different types of births 
are tested and found generally to be signifi- 
cantly different. 


(23)*
WADLEY, F. M. (USDA Bureau of Entomology and
Plant Quarantine). Incomplete Block Design
Adapted to Paired Tests of Mosquito Repellents.
The application of such a plan is described.


There can be only two arms (“plots”) in the 
“block”, which is a subject on a given date. 
Analysis followed Yates’ standard method. 
The scheme worked well and gave a worth- 
while gain in efficiency, compared to the al- 
ternate plan of using a standard in each pair. 
It will probably be of value in such situations. 


(24)
ANDERSON, R. L. (Institute of Statistics). The
Analysis of Orthogonal Square Lattice Experiments
with d duplications of the Basic Design.

A generalized method of analyzing any square lattice experiment with
$k^2$ treatments put in blocks of k each is presented, provided the r
replications in the basic designs are orthogonal. The basic design is
duplicated d times. For a 6x6 triple lattice using 9 complete
replications, k=6, r=3, and d=3.

The methods of analysis follow those developed by Yates and Cochran,
utilizing any added information contained in the inter-block variance.
Yates has denoted the weighting factor for making block adjustments to
be [a ratio with numerator w - w'; the denominator is illegible in the
source], where $1/w = \sigma_e^2$, the true intra-block error, and
$1/w' = \sigma_e^2 + k\sigma_b^2$, the true inter-block error.

It is shown that the best estimates of w and w' are (respectively)
[expressions illegible in the source], where $E_e$ is the computed
intra-block variance and $E_b$ the computed inter-block variance. $E_b$
is found by pooling components (a) and (b) in the analysis of variance.

The average variance of the difference between two adjusted treatment
means is [expression illegible in the source].


(25)
MUHRER, M. E. and A. G. HOGAN. (University of
Missouri). Effect of Goitrogenic Drugs on Fattening
Swine. Proc. Soc. Exp. Biol. Med. 60:211-212.
1945.


Studies were made of the effect of thiouracil 
and thiourea upon the economy of gain and 
morphology (body measurements) in swine. 
The thiouracil-fed animals made greater gains 
than either the thiourea-fed animals or the 
animals which received the basal ration alone. 
In the analysis of variance in gains by the 
method of Snedecor the value of F (ratio of 
the standard deviation squared between and 
within treatments) was 23.6 and greatly ex- 
ceeds the 1% point. Along with other body 
measurements the increase in height and depth 
were studied in relation to the increase in 
weight. From the analysis of variance it was
found that the ratio of the weight increase to
height was more significant than either weight
or height increase. However, the ratio of
weight increase to depth increase was not
significant. The thiouracil kept the animals
significantly shorter than the control animals
but affected the depth increase only through
affecting the size of the animal.


*Biometrics Bulletin, Vol. 2, No. 2, page 30. April 1946.


NEWS AND NOTES 


U. S. NAIR, Head of the Department of 
Statistics, Travancore University, Trivandrum, 
South India, writes, “You will be interested 
in knowing that a Statistical Laboratory has 
been created in our University where post 
graduate tuition in statistics is given; also, 
research fellows work in the Laboratory. The 
post graduate course consists of two years 
during which time intensive training in mathe- 
matical analysis, theoretical and _ practical 
statistics will be given. The subjects in ap- 
plied statistics include design of experiments, 
factor analysis, statistical physics, biometry 
and certain aspects of mathematical economics.”
He urges us to send publications which
will be helpful to him, “living as we do in 
one of the remotest corners of the world.” .. . 
Word has been received from V. G. PANSE, 
Institute of Plant Industry, Indore, Central 
India. He has published an account of sample 
surveys carried out for estimating the yield of 
commercial crops of cotton ... D. D. KOSAMBI,
Professor of Mathematics, is with the
Tata Institute of Fundamental Research, Bombay
26, India ... The Proceedings of the
33rd Indian Science Congress, Bangalore 1946 
shows a very active Section on Statistics. K. 
B. MADHAVA is president of the Section. 
He is a University Professor of Mysore who 
is now on foreign service with the government 
of India, Transport Department, New Delhi. 
The title of his Presidential Address was 
“Statistics gets firmly woven into our fabric 
of thinking”—a most interesting plea for sta- 
tistics. He writes, “through sheer power of 
logic, the statistical method has secured a 
place for itself in all fields of thought.” There 
were 47 papers listed in the report of the 
Abstracts of papers discussed at the Congress. 
The papers were grouped under theoretical, 
agricultural, economic, vital and general sta- 
tistics. P. K. BOSE and P. C. MAHALANOBIS,
Presidency College, Calcutta, and A. R.
SEN, Lucknow were the speakers in the agri- 
cultural group. In the vital statistics section, 
C. CHANDRA SEKAR, Calcutta, reported on 
“Reproductive wastage and infant mortality 
as obtained from the records of some maternity 
and Child Welfare centers in Calcutta.” An- 
other paper was by K. K. MATHEN and R. 
B. LAL, Calcutta, dealing with “Studies in 
the health problems of a rural community in 
western Bengal. Part I. Population prob- 
lems” ... The Deputy Director of Agricul- 
ture, (Crop Research) Bombay Province is 
V. M. CHAVAN, College of Agriculture, Poo- 
na... P. V. SUKHATME is Statistical Ad- 
viser, Imperial Council of Agricultural Re- 
search, New Delhi... R. J. KALAMKAR is 
an Officer on Special Duty, Office of the Di- 
rector of Agriculture, Central Provinces, Nag- 
pur... R. C. BOSE is Head of the Post- 
graduate Department of Statistics, Calcutta 
University, Calcutta, India . . . B. M. PUGH, 
Professor of Agronomy at Allahabad Agricul- 
tural Institute and Editor of the Allahabad 
Farmer intends to leave India in July to visit
the United States. He will be visiting the
agricultural colleges of this country.

JOSEPH CARMIN sends an announcement 
regarding the Independent Biological Labora- 
tories, Kefar-Malal, Ramatayim, Palestine. 
“This announcement is intended to let us know 
that we went safely through the war, that we 
are continuing our work, and that we are try- 
ing now to get in touch anew with institutions 
and scientists abroad.” They were forced to 
leave their precious domicile at Tel-Aviv and 
move in great haste to the country. Buildings 
have been erected now to house their library, 
collections, apparatus and 12 research work- 
ers. “Research was going on uninterruptedly 
all the time of the war on utilization of plant 
and animal material for the manufacture of 


different commodities and a special consulting 
bureau was established in this line. Work 
was continued on previous lines in different 
problems of ecology, genetics and physiology 
of plants and animals, entomology, phytopath- 
ology and marine biology” . . . Other biolog- 
ical research workers that we have heard from 
recently are P. S. OSTERGAARD who is 
with the Agricultural Research Laboratory, 
Copenhagen, Denmark; HALVDAN  AS- 
TRAND, chief statistician of the Swedish 
Sugar Company, Arlov; S. BERGE, professor 
of animal breeding and genetics at the Agri- 
cultural College of Norway, Aas; A. SCIU- 
CHETTI also in the field of animal breeding 
and genetics at the Agriculture School, Plan- 
tahol, Landquart (Switzerland); and S. H. 
JAYEWICKREME, Division of Medical En- 
tomology, Colombo, Ceylon ... ALVIN KE- 
ZER, chief agronomist at Colorado A. & M. 
College lets us know about the Diamond An- 
niversary of his school ... K. M. AUTREY 
is now Associate Professor of Dairy Husban- 
dry at The Pennsylvania State College ... 
Lt. Colonel JAMES H. BYWATERS returned 
to the U. S. Regional Laboratory at East Lan- 
sing, Michigan, early in May. While on 
terminal leave, he and Mrs. Bywaters visited 
J. HOLMES MARTIN, Head of Department 
of Poultry Husbandry, Purdue University. 
Think they also visited Iowa State College ... 


Officers of the American Statistical Association:
President, Isador Lubin; Directors, Chester
I. Bliss, E. Grosvenor Plowman, Walter A.
Shewhart, Samuel A. Stouffer, Willard L.
Thorp, Helen M. Walker; Vice-Presidents, F.
L. Carmichael, S. S. Wilks, Dorothy Swaine
Thomas; Secretary-Treasurer, Lester S. Kellogg.

Officers of the Biometrics Section: Chairman, 
D. B. DeLury; Secretary, H. W. Norton; Sec- 
tion Committee members; E. J. deBeer, A. E. 
Brandt, J. W. Fertig, J. G. Osborne, J. W. 
Tukey. 




WALTER C. JACOB, recently Lt. Commander
with the Bureau of Ships Statistics Section,
[...] Island Vegetable Research Farm near Riverhead,
Long Island . . . CHARLES M. MOTTLEY
has “reverted to my old field of work in
fishery biology.” He is Chief of the Section
of Eastern Agricultural Investigations in the
Interior Department . . . RUTH R. PUFFER,
director of Statistical Service of the Tennessee
Department of Public Health, has been granted
a leave of absence to serve as a visiting
professor in the School of Public Health of
the University of Chile for the term, June-August,
1946. In addition to her duties in
[...] sponsorship of the Rockefeller Foundation
which has been instrumental in the establishment
of this School of Public Health . . .
FRANK A. WECK was a Captain with the
Office of the Surgeon General. He is with the
Actuarial Division, Metropolitan Life Insurance
[...] who was in the Department of Mathematics
at the University of Oregon has recently been
appointed Mathematician with the
Army-Navy Air Intelligence Division.


Editorial Committee for the Biometrics Bulletin:
Chairman, Gertrude Cox; members, R.
L. Anderson, C. I. Bliss, W. G. Cochran,
Churchill Eisenhart, H. W. Norton, G. W.
Snedecor, C. P. Winsor.


Material for the BULLETIN should be addressed
to the Chairman of the Editorial Committee,
Institute of Statistics, North Carolina
State College, Raleigh, N. C.; material for
Queries should go to “Queries,” Statistical
Laboratory, Iowa State College, Ames, Iowa,
or to any member of the committee.




